AI & Reinforcement Learning

Artificial Intelligence

Deep learning fundamentals and reinforcement learning, with an interactive Q-learning maze that demonstrates the agent–environment loop end to end.

Deep Learning

Multi-layer neural networks trained end-to-end with backpropagation.

Architectures, in one line each

  • MLP: stacked dense layers hₖ₊₁ = σ(Wₖ hₖ + bₖ). Universal approximator; no built-in spatial or temporal structure.
  • CNN: weight-shared convolutions exploit translation equivariance; the workhorse for images.
  • Transformer: self-attention softmax(QKᵀ/√d) V replaces recurrence; the basis of modern LLMs and ViTs.
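
To make two of the formulas above concrete, here is a minimal pure-Python sketch of one dense layer and of scaled dot-product attention. The function names (dense, softmax, attention) and the toy inputs are illustrative assumptions rather than any library's API, and real code would use batched array operations instead of nested lists.

```python
import math

def dense(W, b, h, act):
    """One MLP layer: act(W h + b), with W given as a list of rows."""
    return [act(sum(w_ij * h_j for w_ij, h_j in zip(row, h)) + b_i)
            for row, b_i in zip(W, b)]

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(Q, K, V):
    """Scaled dot-product attention, softmax(Q K^T / sqrt(d)) V, one query row at a time."""
    d = len(K[0])  # key dimension used for scaling
    out = []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
        weights = softmax(scores)  # one weight per key, summing to 1
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out

def relu(x):
    return max(0.0, x)

# Toy inputs (assumed values, chosen only for illustration).
h1 = dense([[1.0, -1.0], [0.5, 0.5]], [0.0, 1.0], [2.0, 3.0], relu)

# Identical keys give uniform attention weights, so the output is the mean of V's rows.
mixed = attention([[1.0, 0.0]], [[1.0, 1.0], [1.0, 1.0]], [[1.0, 2.0], [3.0, 4.0]])
```

Each output row of attention is a convex combination of the rows of V, which is why identical keys reduce it to a plain average.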

Backpropagation

Gradients propagate via the chain rule through the computation graph:

∂L/∂θ = (∂L/∂z) · (∂z/∂θ)
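
As a small worked instance of this chain rule, the sketch below assumes a one-parameter linear model z = w·x + b with squared-error loss L = (z − y)², then compares the analytic gradient with a central finite difference. The model, the loss, and the numbers are assumptions picked for illustration.

```python
def loss(w, b, x, y):
    z = w * x + b          # intermediate value z(θ)
    return (z - y) ** 2    # L(z)

def grad_w(w, b, x, y):
    z = w * x + b
    dL_dz = 2.0 * (z - y)  # ∂L/∂z
    dz_dw = x              # ∂z/∂w
    return dL_dz * dz_dw   # chain rule: ∂L/∂w = (∂L/∂z) · (∂z/∂w)

w, b, x, y = 0.5, 0.1, 2.0, 3.0
eps = 1e-6

analytic = grad_w(w, b, x, y)
# Central finite difference as an independent check on the analytic gradient.
numeric = (loss(w + eps, b, x, y) - loss(w - eps, b, x, y)) / (2 * eps)
```

Here z = 1.1, so ∂L/∂z = 2 · (1.1 − 3) = −3.8 and ∂L/∂w = −3.8 · 2 = −7.6; the finite-difference estimate agrees up to rounding error.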

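Modern frameworks apply reverse-mode autodiff to this graph: PyTorch records it dynamically as operations execute, while JAX traces Python functions and differentiates the traced program. With gradients computed automatically, training stability depends mainly on initialisation, normalisation (BatchNorm, LayerNorm), and residual connections.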
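
The idea behind reverse-mode autodiff can be sketched in a few lines of plain Python. The Value class below is a hypothetical toy for scalars, not how PyTorch or JAX are implemented: each operation records its parents and local derivatives, and backward() walks the recorded graph in reverse topological order, accumulating gradients via the chain rule.

```python
class Value:
    """A scalar node that remembers how it was computed (toy example)."""

    def __init__(self, data, parents=(), local_grads=()):
        self.data = data
        self.grad = 0.0
        self.parents = parents          # nodes this value was computed from
        self.local_grads = local_grads  # d(self)/d(parent) for each parent

    def __add__(self, other):
        return Value(self.data + other.data, (self, other), (1.0, 1.0))

    def __mul__(self, other):
        return Value(self.data * other.data, (self, other),
                     (other.data, self.data))

    def backward(self):
        # Build a topological order so each node is processed after its consumers.
        order, seen = [], set()

        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent in node.parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0  # dL/dL
        for node in reversed(order):
            for parent, local in zip(node.parents, node.local_grads):
                parent.grad += local * node.grad  # chain rule, accumulated

# Toy computation: L = (w*x + b)^2 with assumed values.
x, w, b = Value(2.0), Value(3.0), Value(1.0)
z = w * x + b
L = z * z
L.backward()
```

With these values z = 7, so ∂L/∂z = 14, ∂L/∂w = 14 · x = 28, ∂L/∂x = 14 · w = 42, and ∂L/∂b = 14. A single backward pass yields all of them, which is why reverse mode suits losses with many parameters and one scalar output.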

GitHub projects

Tutorials